Measurement error terminology

 

Measurement error
The Cambridge Dictionary of Statistics (4th ed)
 

Errors in reading, calculating or recording a numerical value. The difference between observed values of a variable recorded under similar conditions and some fixed true value.

 

  • I'll use "measurement error" in a broad way, also including misclassifications as a special case

PhD thesis (2016)

Prevalence of Strongyloides infection


  • Measurement error: sensitivity and/or specificity \(< 1\)
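
To make the consequence of imperfect sensitivity and specificity concrete, here is a minimal sketch of how the apparent prevalence relates to the true prevalence, and how the Rogan-Gladen estimator back-corrects it. The numbers are hypothetical and chosen only for illustration; this is not necessarily the approach taken in the thesis.

```python
def apparent_prevalence(true_prev, sensitivity, specificity):
    """Expected proportion testing positive with an imperfect test."""
    return sensitivity * true_prev + (1 - specificity) * (1 - true_prev)


def rogan_gladen(apparent_prev, sensitivity, specificity):
    """Back-correct an apparent prevalence (Rogan-Gladen estimator)."""
    denom = sensitivity + specificity - 1
    if denom <= 0:
        raise ValueError("test must be better than chance (Se + Sp > 1)")
    corrected = (apparent_prev + specificity - 1) / denom
    return min(1.0, max(0.0, corrected))  # truncate to [0, 1]


# Hypothetical values, for illustration only.
true_prev = 0.20
se, sp = 0.60, 0.95

observed = apparent_prevalence(true_prev, se, sp)
recovered = rogan_gladen(observed, se, sp)
print(f"apparent prevalence:  {observed:.3f}")
print(f"corrected prevalence: {recovered:.3f}")
```

With a sensitivity of 0.60 the test misses many infections, so the apparent prevalence falls below the true one even though specificity is high.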

Nutritional epidemiology

limited self-reported nutrition data ascertained with a handful of questions and self-reported items fail to acknowledge or accurately measure a system that matches or exceeds the genome in complexity

Nutritional epidemiology

Correlations between 24-h true intake and self-report (FFQ) for various nutrients varied between 0.35 and 0.51.

Measurement error in epidemiology (2001)

 

  • Review of 57 epidemiology papers found 'only' 61% mentioning measurement error in exposure

Non-reporting measurement error

Only one of these studies had an exposure (gender) that, in our opinion, could have been measured with a negligible amount of exposure measurement error

Measurement error in 2016

 

Was measurement error (in exposure or confounder) mentioned as a possibly important factor? Yes in:

  • 56% in top-ranked epidemiology journals (N = 337)
    Am J Epidemiol, Epidemiology, Eur J Epidemiol, Int J Epidemiol, J Clin Epidemiol, J Epidemiol Community Health
  • 25% in top-ranked general medicine journals (N = 228)
    NEJM, JAMA, Lancet, BMJ, Ann Intern Med

Measurement error in 2016

  • 18 articles used methods to correct for or investigate impact of measurement error (3% of total, 7% of those mentioning measurement error)

  • Rife with unsubstantiated claims about anticipated impact of measurement error

  • Myths

Six myths about measurement error

  1. Measurement error is often random
  2. Measurement error can be compensated by large data
  3. Measurement error attenuates exposure effects
  4. Measurement error affects only certain types of research
  5. Measurement error can be solved by simple logical rules or averaging
  6. Measurement error can be prevented but not mitigated

1. Measurement error is often random

Random measurement error because prospective?

Self report means non-differential error?

Terminology

  • Simple situation: each individual \(i\) has a true value \(T_i\) and a corresponding error-contaminated measurement \(X_i\)

  • Classical ("random") measurement error model
    \(X_i = T_i + e_i, \quad e_i \sim N(0,\sigma^2)\)
  • Systematic measurement error model
    \(X_i = \lambda T_i + e_i, \quad e_i \sim N(\alpha,\tau^2)\)
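
A small simulation may help to see the difference between the two models. The sketch below draws hypothetical true values and contaminates them once with classical error and once with systematic error; all parameter values (`sigma`, `lam`, `alpha`, `tau`) are assumptions picked for illustration.

```python
import random
import statistics

rng = random.Random(2016)
n = 50_000

# Hypothetical true values T_i (e.g. a standardized exposure).
T = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Classical error: X_i = T_i + e_i, e_i ~ N(0, sigma^2)
sigma = 0.5
X_classical = [t + rng.gauss(0.0, sigma) for t in T]

# Systematic error: X_i = lambda * T_i + e_i, e_i ~ N(alpha, tau^2)
lam, alpha, tau = 0.8, 0.3, 0.5
X_systematic = [lam * t + rng.gauss(alpha, tau) for t in T]


def mean_error(X):
    """Average difference between measurement and truth."""
    return statistics.fmean(x - t for x, t in zip(X, T))


mean_err_classical = mean_error(X_classical)
mean_err_systematic = mean_error(X_systematic)
print(f"mean error, classical:  {mean_err_classical:+.3f}")
print(f"mean error, systematic: {mean_err_systematic:+.3f}")
print(f"variance of X, classical:  {statistics.variance(X_classical):.3f}")
print(f"variance of X, systematic: {statistics.variance(X_systematic):.3f}")
```

Under classical error the measurements are right on average but more variable than the truth. Under systematic error they are shifted (by `alpha`) and rescaled (by `lam`) as well.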

Other terminology

  • Differential measurement error: the magnitude of error (error variance) depends on other key variables

  • Heteroscedastic measurement error: the magnitude of error depends on the true values of what is measured

  • Berkson measurement error: \(T_i = X_i + e_i\)

 

  • Easy to get lost in the terminology
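
One way to keep classical and Berkson error apart is to simulate both and look at a regression slope. The sketch below assumes a linear outcome model \(Y_i = \beta T_i + \varepsilon_i\) with hypothetical values for `beta` and the error standard deviation; it illustrates the well-known contrast that classical error attenuates the slope while Berkson error leaves it (approximately) unbiased in a linear model.

```python
import random
import statistics


def ols_slope(x, y):
    """Least-squares slope of y on x (simple linear regression)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(1)
n = 50_000
beta = 1.0   # hypothetical true effect of T on Y
sigma = 1.0  # hypothetical error standard deviation

# Classical error: the measurement scatters around the truth, X = T + e.
T_c = [rng.gauss(0, 1) for _ in range(n)]
X_c = [t + rng.gauss(0, sigma) for t in T_c]
Y_c = [beta * t + rng.gauss(0, 1) for t in T_c]

# Berkson error: the truth scatters around the recorded value, T = X + e.
X_b = [rng.gauss(0, 1) for _ in range(n)]
T_b = [x + rng.gauss(0, sigma) for x in X_b]
Y_b = [beta * t + rng.gauss(0, 1) for t in T_b]

slope_classical = ols_slope(X_c, Y_c)  # expected near beta / 2 here
slope_berkson = ols_slope(X_b, Y_b)    # expected near beta
print(f"slope with classical error: {slope_classical:.3f}")
print(f"slope with Berkson error:   {slope_berkson:.3f}")
```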

My definition of random measurement error

Measurement error is completely random if the errors in measurement have a constant variance around their true value, are independent of this true value and other observed or unobserved patient characteristics, and the measurements with error are on the same scale as the true values (no systematic error)

Example: self-reported weight (NHANES)


Example: "self reporting to the mean"

Another example: body mass index

For example, among healthy male never-smokers, classifications affecting the overweight and the reference categories changed the hazard ratio for overweight from 0.85 with measured data to 1.24 with self-reported data

How about systolic blood pressure?

SBP measurement error may depend on:

  • White coat effect (1)
  • Gender effect (2)
  • Number of BP measurements (2)
  • BP measurement instrument (3)
  • Adherence to measurement protocol (4)
  • etcetera.

Random measurement error?

2. Measurement error can be compensated by large data

Big is always better?

Random measurement error - observations

Random measurement error - estimating a mean

Random measurement error - linear regression

Random error - masking of relationships

"Triple Whammy of Measurement Error"

Larger \(N\):

  • can compensate for reduced precision of exposure effect estimate due to measurement error

 

Large \(N\) has no effect on:

  • Measurement error bias in exposure effect estimate
  • Measurement error masking of functional relations
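
The contrast between precision and bias can be checked with a small simulation. The sketch below fits a simple linear regression on an exposure with classical error at two sample sizes; the true slope of 1.0 and an error variance equal to the exposure variance are assumptions made for illustration.

```python
import math
import random
import statistics


def slope_and_se(x, y):
    """Least-squares slope of y on x and its model-based standard error."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    return slope, math.sqrt(rss / (n - 2) / sxx)


def simulate(n, rng, beta=1.0, sigma=1.0):
    """Regress Y on the error-prone X = T + e instead of on T."""
    T = [rng.gauss(0, 1) for _ in range(n)]
    X = [t + rng.gauss(0, sigma) for t in T]
    Y = [beta * t + rng.gauss(0, 1) for t in T]
    return slope_and_se(X, Y)


rng = random.Random(3)
slope_small, se_small = simulate(500, rng)
slope_large, se_large = simulate(50_000, rng)
print(f"N = 500:    slope {slope_small:.3f} (SE {se_small:.4f})")
print(f"N = 50000:  slope {slope_large:.3f} (SE {se_large:.4f})")
```

In this setup the standard error shrinks roughly tenfold with the larger sample, while the slope stays near 0.5 rather than the true 1.0: the estimate becomes more precise, but about the wrong value.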

 

Larger data sets: more precisely wrong?

3. Measurement error attenuates exposure effects

Attenuation = towards the "null"

"Non-differential classification bias"

Attenuation - random measurement error in exposure

Illustration systematic measurement error in exposure

Exposure effect underestimated?

  • Attenuation bias is defined as an expectation

 

To claim attenuation bias:

  • The error is in the exposure
  • Exposure is not a categorical variable with \(>2\) categories
  • The error is classical error
  • The linear model applies (e.g. it's not logistic regression)
  • There are no other variables measured with error in the model
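
For reference, this is the standard large-sample argument behind the attenuation claim in the simplest case. The outcome model \(Y_i = \beta_0 + \beta T_i + \varepsilon_i\), the classical error \(X_i = T_i + e_i\) with \(\operatorname{Var}(e_i) = \sigma^2\), and the independence of \(e_i\) from \(T_i\) and \(\varepsilon_i\) are assumptions of this sketch; \(\kappa\) is a symbol introduced here for the attenuation factor.

```latex
\[
\hat\beta_X \;\xrightarrow{\;p\;}\;
\frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(X)}
= \frac{\beta \operatorname{Var}(T)}{\operatorname{Var}(T) + \sigma^2}
= \kappa \beta,
\qquad
\kappa = \frac{\operatorname{Var}(T)}{\operatorname{Var}(T) + \sigma^2} \in (0, 1].
\]
```

The result is a statement about the limit (or expectation) of the estimator, and each step leans on the conditions listed above. Drop one of them and the factor \(\kappa\) no longer describes the bias.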

Other variables measured with error

  • Cox proportional hazards models
  • Added simulated random measurement error to 1 exposure and 1 confounder variable
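
The same idea can be sketched in a few lines. Note that the sketch below swaps in an ordinary linear model for the Cox model, purely to keep the example short, and that the correlation between exposure and confounder, the effect sizes and the error variances are all hypothetical.

```python
import random
import statistics


def two_predictor_ols(x, w, y):
    """Coefficients (b_x, b_w) from least squares of y on x and w."""
    mx, mw, my = statistics.fmean(x), statistics.fmean(w), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sww = sum((b - mw) ** 2 for b in w)
    sxw = sum((a - mx) * (b - mw) for a, b in zip(x, w))
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    swy = sum((b - mw) * (c - my) for b, c in zip(w, y))
    det = sxx * sww - sxw ** 2
    return (sww * sxy - sxw * swy) / det, (sxx * swy - sxw * sxy) / det


rng = random.Random(7)
n = 50_000
rho = 0.7  # correlation between true exposure and true confounder
beta_exposure, beta_confounder = 0.2, 1.0

T = [rng.gauss(0, 1) for _ in range(n)]
C = [rho * t + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1) for t in T]
Y = [beta_exposure * t + beta_confounder * c + rng.gauss(0, 1)
     for t, c in zip(T, C)]

# Random (classical) error added to both variables.
X = [t + rng.gauss(0, 0.1 ** 0.5) for t in T]  # exposure, error variance 0.1
W = [c + rng.gauss(0, 1.0) for c in C]         # confounder, error variance 1.0

b_true, _ = two_predictor_ols(T, C, Y)
b_error, _ = two_predictor_ols(X, W, Y)
print(f"exposure effect, error-free variables:  {b_true:.3f}")
print(f"exposure effect, error-prone variables: {b_error:.3f}")
```

Here the poorly measured confounder leaves residual confounding, so the exposure effect is pushed away from the null even though every error is "random".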


According to Carroll

4. Measurement error affects only certain types of research

Types of research

  • Traditional epidemiologic research: observational, etiologic/therapeutic oriented

  • Randomized (controlled) trials

  • Prognostic modeling studies

Randomized controlled trials

Measurement error in trial endpoints

 

  • If measurement error is random: "only" loss of power
  • If measurement error is not random: loss of power, inflated type-I error and bias in effect estimator
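
A simulated two-arm trial with a continuous endpoint illustrates both bullets. The sketch assumes a true treatment effect of 0.5, and uses an error whose mean depends on the arm (as could happen with unblinded outcome assessment) as one hypothetical example of non-random error.

```python
import random
import statistics

rng = random.Random(42)
n_per_arm = 50_000
effect = 0.5  # hypothetical true treatment effect


def simulate_arm(treated):
    """Return the endpoint without error, with random and with differential error."""
    true_y = [effect * treated + rng.gauss(0, 1) for _ in range(n_per_arm)]
    random_err = [y + rng.gauss(0, 1) for y in true_y]
    # Error whose mean depends on the arm (assumed shift of 0.3 if treated).
    differential = [y + rng.gauss(0.3 * treated, 1) for y in true_y]
    return true_y, random_err, differential


def diff_and_se(control, treatment):
    """Difference in means and its standard error."""
    diff = statistics.fmean(treatment) - statistics.fmean(control)
    se = (statistics.variance(control) / len(control)
          + statistics.variance(treatment) / len(treatment)) ** 0.5
    return diff, se


control = simulate_arm(0)
treatment = simulate_arm(1)
labels = ("no error", "random error", "differential error")
results = {lab: diff_and_se(c, t)
           for lab, c, t in zip(labels, control, treatment)}
for lab, (diff, se) in results.items():
    print(f"{lab:>18}: estimate {diff:.3f} (SE {se:.4f})")
```

Random error in the endpoint leaves the estimate centred on 0.5 but inflates its standard error. The arm-dependent error shifts the estimate itself.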

How about prediction models?

Prognostic modeling studies

 

  • Predictor variables with measurement error are generally worse predictors
  • Development-validation difference in measurement error can impact predictive performance

Prognostic modeling studies

 

  • Prediction model calibration strongly affected by development-validation difference in measurement error

Genetic studies

5. Measurement error can be solved by simple logical rules or averaging

Increase the number of measurements

The idea is that multiple error-prone measurements of the same phenomenon are likely to increase the overall accuracy of measurement when we combine them, often by averaging or summing, or by simple logical rules.

  • Multiple survey items for same construct

  • Multiple blood pressure measurements for blood pressure

  • Multiple diagnostic tests for the same disease
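
How much averaging helps depends on what kind of error is being averaged. The sketch below assumes each reading contains a random part that differs between repeats and a systematic part that is shared by all repeats of the same person (think of a white coat effect); the sizes of both parts are hypothetical.

```python
import random

rng = random.Random(5)
n_people = 5_000
sigma_random = 1.0  # SD of the error that varies between repeats


def rmse_of_average(k):
    """Root mean squared error of the average of k repeated readings."""
    total_sq = 0.0
    for _ in range(n_people):
        truth = rng.gauss(0, 1)
        shared = rng.gauss(0.5, 0.5)  # systematic error, same in every repeat
        readings = [truth + shared + rng.gauss(0, sigma_random)
                    for _ in range(k)]
        average = sum(readings) / k
        total_sq += (average - truth) ** 2
    return (total_sq / n_people) ** 0.5


rmse = {k: rmse_of_average(k) for k in (1, 5, 50)}
for k, value in rmse.items():
    print(f"average of {k:>2} readings: RMSE {value:.3f}")
```

In this setup the error of the average drops quickly over the first few repeats and then levels off at about 0.7, because the shared component does not average out.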

Accuracy of an averaging score


Example of a logical rule

 

  • A priori defined logical rule based on multiple test results to improve disease verification

Composite reference standards

  • Composite reference standards never completely remove all of the measurement error

  • Disease verification by composite reference standard can be worse than an individual test in the rule
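
A worked example shows how a composite can do worse than one of its components. The sketch assumes an "either test positive" rule, two tests whose errors are conditionally independent given true disease status, and hypothetical accuracy values and prevalence.

```python
def or_rule(se1, sp1, se2, sp2):
    """Sensitivity and specificity of 'positive if either test is positive'.

    Assumes the two tests err independently given true disease status.
    """
    sensitivity = 1 - (1 - se1) * (1 - se2)
    specificity = sp1 * sp2
    return sensitivity, specificity


def proportion_correct(se, sp, prevalence):
    """Overall proportion of individuals classified correctly."""
    return prevalence * se + (1 - prevalence) * sp


# Hypothetical values, for illustration only.
se1, sp1 = 0.90, 0.98  # a fairly good test
se2, sp2 = 0.70, 0.85  # a weaker second test
prevalence = 0.10

se_comp, sp_comp = or_rule(se1, sp1, se2, sp2)
acc_test1 = proportion_correct(se1, sp1, prevalence)
acc_comp = proportion_correct(se_comp, sp_comp, prevalence)
print(f"composite: Se {se_comp:.3f}, Sp {sp_comp:.3f}")
print(f"correctly classified, test 1 alone: {acc_test1:.3f}")
print(f"correctly classified, composite:    {acc_comp:.3f}")
```

The composite gains sensitivity but inherits the false positives of both tests, so at low prevalence it classifies fewer individuals correctly than test 1 on its own.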

6. Measurement error can be prevented but not mitigated

Books

Measurement error correction approaches

These approaches need measurements of the error-prone variables (X), with or without the truth (T), in an internal or external subset

  • Regression calibration: "X" is replaced by a prediction of "T"

  • SIMEX: Additional error is simulated and added to "X", extrapolation is used to estimate the effect without error in "X"

  • MIME: "T" is multiply imputed for individuals for whom "T" wasn't observed

  • Latent variable models: multiple measures of "X" to approximate "T"
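
As an illustration of the first approach, here is a minimal sketch of regression calibration with an internal validation subset. It assumes classical error, a linear outcome model and that "T" is observed for the first `n_val` individuals; all values are hypothetical, and the code does not follow the interface of any particular package.

```python
import random
import statistics


def ols(x, y):
    """Intercept and slope from simple linear regression of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope


rng = random.Random(11)
n, n_val = 50_000, 5_000
beta = 1.0  # hypothetical true effect of T on Y

T = [rng.gauss(0, 1) for _ in range(n)]
X = [t + rng.gauss(0, 1) for t in T]  # error-prone measurement
Y = [beta * t + rng.gauss(0, 1) for t in T]

# Naive analysis: use X as if it were T.
_, naive_slope = ols(X, Y)

# Calibration model T ~ X, fitted only where T is observed.
a_cal, b_cal = ols(X[:n_val], T[:n_val])

# Replace X by the predicted T for everyone, then refit the outcome model.
T_hat = [a_cal + b_cal * x for x in X]
_, rc_slope = ols(T_hat, Y)

print(f"naive slope:      {naive_slope:.3f}")
print(f"calibrated slope: {rc_slope:.3f}")
```

The calibrated slope lands close to the true value, but its standard error has to account for the uncertainty in the calibration model as well, for instance by bootstrapping both steps together.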

What is still missing?

  • Measurement error in the outcome of a prediction model (no gold standard)

  • User friendly software (https://github.com/LindaNab/mecor)

  • Approaches that address complex measurement error structures in multiple variables

  • Unified approaches to simultaneously correct for measurement error, confounding and missing values

To summarize

Six myths about measurement error

Myth 1: Measurement error is often random

measurement error is seldom truly random

 

Myth 2: Measurement error can be compensated by large data

large \(N\) alleviates only one aspect of measurement error issues

 


Myth 3: Measurement error attenuates exposure effects

generally difficult to predict impact of measurement error without explicit theorizing and modeling

 

Myth 4: Measurement error affects only certain types of research

it is very unlikely that, if present, measurement error doesn't affect research findings


Myth 5: Measurement error can be solved by simple logical rules or averaging

too simple a solution for a complex problem

 

Myth 6: Measurement error can be prevented but not mitigated

better to prevent than to cure measurement error, but plenty of correction methods are available when it can't be prevented; we should probably start using them more

Concluding remark

The myths about measurement error contribute to a tolerant attitude towards measurement error. They seem founded on the misconception that measurement errors have limited impact on the results of epidemiologic research. Notably, the misconception that bias is towards the null effect may wrongly reassure researchers that they are protected against overestimating effects.